Fast Exact Shortest-Path Distance Queries on Large 
Networks by Pruned Landmark Labeling 
ABSTRACT

We propose a new exact method for shortest-path distance
queries on large-scale networks. Our method precomputes
distance labels for vertices by performing a breadth-first
search from every vertex. Although this seems too obvious and
too inefficient at first glance, the key ingredient introduced
here is pruning during the breadth-first searches. While we can
still answer the correct distance for any pair of vertices from
the labels, pruning surprisingly reduces the search space and the
sizes of labels. Moreover, we show that we can perform 32 or
64 breadth-first searches simultaneously by exploiting bitwise
operations. We experimentally demonstrate that the combination
of these two techniques is efficient and robust on
various kinds of large-scale real-world networks. In particular,
our method can handle social networks and web graphs
with hundreds of millions of edges, which are two orders of
magnitude larger than the limits of previous exact methods,
with query times comparable to those of previous methods.

Categories and Subject Descriptors 

E.1 [Data]: Data Structures - Graphs and networks

General Terms 

Algorithms, Experimentation, Performance 
1. INTRODUCTION

A distance query asks the distance between two vertices 
in a graph. Without doubt, answering distance queries is 
one of the most fundamental operations on graphs, and it 
has a wide range of applications. For example, on social
networks, the distance between two users is considered to indicate
their closeness, and is used in socially-sensitive search to help
users find more related users or content [40, 42], or to
analyze influential people and communities [19, 6]. On web
graphs, the distance between web pages is one of the indicators of
relevance, and is used in context-aware search to give higher
ranks to web pages more related to the currently visited
web page [39, 29]. Other applications of distance queries include
top-k keyword queries on linked data [16, 37], discovery
of optimal pathways between compounds in metabolic networks
[31, 32], and management of resources in computer
networks [28, 7].

Of course, we can compute the distance for each query by
using a breadth-first search (BFS) or Dijkstra's algorithm.
However, they take more than a second for large graphs,
which is too slow to use as a building block of these applications.
In particular, applications such as socially-sensitive
search or context-aware search should have low latency since
they involve real-time interactions between users, while they
need distances between a number of pairs of vertices to rank
items for each search query. Therefore, distance queries
should be answered much more quickly, say, in microseconds.

The other extreme approach is to compute distances be- 
tween all pairs of vertices beforehand and store them in an 
index. Though we can answer distance queries instantly, 
this approach is also unacceptable since preprocessing time 
and index size are quadratic and unrealistically large. Due 
to the emergence of huge graph data, design of more mod- 
erate and practical methods between these two extreme ap- 
proaches has been attracting strong interest in the database 
community [Tl[2l[4Tl[38l[l[3Ql[IZl • 

Generally, there are two major graph classes of real-world
networks: one is road networks, and the other is complex
networks such as social networks, web graphs, biological
networks, and computer networks. For road networks, since
their structures are easier to grasp and exploit, research has
already been very successful. Distance queries on road
networks can now be processed in less than one microsecond for
the complete road network of the USA [1].

In contrast, answering distance queries on complex
networks is still a highly challenging problem. The methods
for road networks do not perform well on these networks
since their structures are totally different. Several methods
have been proposed for these networks, but they suffer
from poor scalability. They take at least thousands
of seconds or tens of thousands of seconds to index networks 
with millions of edges [mgHTTj ■ 

To handle larger complex networks, approximate methods
have also been studied in addition to these exact methods. That is,
they do not always answer correct distances. They are
successful in terms of much better scalability and very small
average relative error for random queries. However, some of
these methods take milliseconds to answer queries [15, 38,
30], which is about three orders of magnitude slower than
other methods. Some other methods answer queries in
microseconds [29, 40], but it is reported that the precision of these
methods for close pairs of vertices is not high [30, 4]. This
drawback might be critical for applications such as socially-sensitive
search or context-aware search since, in these
applications, distance queries are employed to distinguish close
items.

1.1 Our Contributions 

To address these issues, in this paper, we present a new
method for answering distance queries in complex networks.
The proposed method is an exact method. That is, it always
answers queries with the exact distance. It has much better
scalability than previous exact methods and can handle
graphs with hundreds of millions of edges. Nevertheless,
the query time is very small, around ten microseconds.
Though our method can handle directed and/or weighted
graphs as we mention later, in the following we assume
undirected, unweighted graphs for simplicity of exposition.

Our method is based on the notion of distance labeling
or distance-aware 2-hop cover. The idea of 2-hop cover is
as follows. For each vertex $u$, we pick a set $C(u)$ of
candidate vertices so that every pair of vertices $(u, v)$ has
at least one vertex $w \in C(u) \cap C(v)$ on a shortest path
between the pair. For each vertex $u$ and each vertex $w \in C(u)$,
we precompute the distance $d_G(u, w)$ between them. We
say that the set $L(u) = \{(w, d_G(u, w))\}_{w \in C(u)}$ is the label
of $u$. Using labels, it is clear that the distance $d_G(u, v)$
between two vertices $u$ and $v$ can be computed as
$\min\{\delta + \delta' \mid (w, \delta) \in L(u), (w, \delta') \in L(v)\}$.
The family of labels $\{L(u)\}$
is called a 2-hop cover. Distance labeling is also commonly
used in previous exact methods [13, 12, 2, 17], but we propose
a totally new and different approach to compute the labels,
referred to as the pruned landmark labeling.

The idea of our method is simple and rather radical: from
every vertex, we conduct a breadth-first search and add the
distance information to labels of visited vertices. Of course,
if we naively implement this idea, we need $O(nm)$ preprocessing
time and $O(n^2)$ space to store the index, which is
unacceptable. Here, $n$ is the number of vertices and $m$ is the
number of edges. Our key idea to make this method practical
is pruning during the breadth-first searches. Let $S$ be a
set of vertices and suppose that we already have labels that
can answer the correct distance between two vertices if a shortest
path between them passes through a vertex in $S$. Suppose
we are conducting a BFS from $v$ and visiting $u$. If there is
a vertex $w \in S$ such that $d_G(v, u) = d_G(v, w) + d_G(w, u)$,
then we prune $u$. That is, we do not traverse any edges from
$u$. As we prove in Section 4.3, after this pruned BFS from
$v$, the labels can answer the distance between two vertices
if a shortest path between them passes through a vertex in
$S \cup \{v\}$.

Interestingly, our method combines the advantages of three
different previous successful approaches: landmark-based
approximate methods [29, 38, 30], tree-decomposition-based
exact methods [41, 4], and labeling-based exact methods [13,
12, 2]. Landmark-based approximate methods achieve
remarkable precision by leveraging the existence of highly central
vertices in complex networks [29]. This fact is also
the main reason for the power of our pruning: by conducting
breadth-first searches from these central vertices first,
we can drastically prune later breadth-first searches. Tree-decomposition-based
methods exploit the core-fringe structure
of networks [10, 27] by decomposing tree-like fringes
of low tree-width. Though our method does not explicitly
use tree decompositions, we prove that our method can efficiently
process graphs of small tree-width. This result indicates
that our method also exploits the core-fringe structure.
As with other labeling-based methods, the data structure of
our index is simple and query processing is very quick because
of the locality of memory access.

Table 1: Summary of experimental results of previous methods
and the proposed method for exact distance queries.

Method           Network    |V|     |E|     Indexing    Query
TEDI             Computer   22 K    46 K    17 s        4.2 μs
                 Social     0.6 M   0.6 M   2,226 s     55.0 μs
HCL              Social     7.1 K   0.1 M   1,003 s     28.2 μs
                 Citation   0.7 M   0.3 M   253,104 s   0.2 μs
TD               Social     0.3 M   0.4 M   9 s         0.5 μs
                 Social     2.4 M   4.7 M   2,473 s     0.8 μs
HHL              Computer   0.2 M   1.2 M   7,399 s     3.1 μs
                 Social     0.3 M   1.9 M   19,488 s    6.9 μs
PLL (this work)  Web        0.3 M   1.5 M   4 s         0.5 μs
                 Social     2.4 M   4.7 M   61 s        0.6 μs
                 Social     1.1 M   114 M   15,164 s    15.6 μs
                 Web        7.4 M   194 M   6,068 s     4.1 μs

Though this pruned landmark labeling scheme is already
powerful by itself, we propose another labeling scheme with
a different kind of strength and combine them to further
improve the performance. That is, we show that labeling
by breadth-first search can be implemented in a bit-parallel
way, which exploits the property that the number of bits
$b$ in a register word is typically 32 or 64 and that we can perform
bit manipulations on these $b$ bits simultaneously. By
this technique, we can perform BFSs from $b + 1$ vertices
simultaneously in $O(m)$ time. In the beginning, this bit-parallel
labeling (without pruning) works better than the
pruned landmark labeling since pruning does not happen
much. Note that we are not talking about thread-level parallelism,
and our bit-parallelism actually decreases the computational
complexity by a factor of $b + 1$. We can also
use thread-level parallelism in addition to these two labeling
schemes.

As we confirm in our experimental results, our method 
outperforms other state-of-the-art methods for exact dis- 
tance queries. In particular, it has significantly better scal- 
ability than previous methods. It took only tens of seconds 
for indexing networks with millions of edges. This indexing 
time is two orders of magnitude faster than previous meth- 
ods, which took at least thousands of seconds or even more 
than one day. Moreover, our method successfully handled 
networks with hundreds of millions of edges, which is again 
two orders of magnitude larger than networks that have been 
previously used in experiments of exact methods. The query
time is also better than that of previous methods for networks of
the same size, and we confirmed that the query time does
not increase rapidly with network size. We also confirmed
that the index size of our method is comparable to that of
other methods.

In Table 1, we summarize our experimental results and
those of previous exact methods presented in these papers.
We listed the results for the two largest real-world complex
networks from each paper. In our experiments, we further
compare our method with hierarchical hub labeling [2] and
the tree-decomposition-based method [3].

In Section 2, we describe related work on exact and approximate
distance queries. In Section 3, we give definitions
and notions used in this paper. Section 4 is devoted to explaining
our first scheme, the pruned landmark labeling. We
explain our second scheme, the bit-parallel labeling, in Section 5.
In Section 6, we mention variants of distance queries
that we can handle by slightly modifying our method. We explain
our experimental results in Section 7 and conclude in
Section 8.

2. RELATED WORK 

2.1 Exact Methods 

For exact distance queries on complex networks such as
social networks and web graphs, several methods have been
proposed recently.

A large portion of these methods can be considered to be based
on the idea of 2-hop cover [13]. Finding small 2-hop covers
efficiently is a challenging and long-standing problem [13, 12,
2]. One of the latest methods is hierarchical hub labeling [2],
which is based on a method for road networks [1]. Another
recent method related to 2-hop cover is highway-centric labeling
[17]. In this method, we first compute a spanning
tree $T$ and use it as a "highway". That is, when computing
the distance $d_G(u, v)$ between two vertices $u$ and $v$, we output
the minimum over $d_G(u, w_1) + d_T(w_1, w_2) + d_G(w_2, v)$, where
$w_1$ and $w_2$ are vertices in the labels of $u$ and $v$, respectively, and
$d_T(\cdot, \cdot)$ is the distance metric on the spanning tree $T$.

An approach based on tree decompositions is also reported
to be efficient [41, 4]. A tree decomposition of a graph $G$ is a
tree $T$ with each vertex associated with a set of vertices in $G$,
called a bag [35]. Also, the set of bags containing a vertex
in $G$ forms a connected component in $T$. This approach heuristically
computes a tree decomposition and stores shortest-distance
matrices for each bag. It is not hard to compute distances
from this information. The smaller the size of the largest bag
is, the more efficient this method is. Because of the core-fringe
structure of the networks [10, 27], these networks can
be decomposed into one big bag and many small bags, and
the size of the largest bag is moderate though not small.

2.2 Approximate Methods 

To gain more scalability than these exact methods, approximate
methods, which do not always answer correct distances,
have also been studied.

The major approach is the landmark-based approach [36,
40]. The basic idea of these methods is to select a subset $L$ of
vertices as landmarks, and precompute the distance $d_G(\ell, u)$
between each landmark $\ell \in L$ and all the vertices $u \in V$.
When the distance between two vertices $u$ and $v$ is queried,
we answer the minimum of $d_G(u, \ell) + d_G(\ell, v)$ over landmarks
$\ell \in L$ as an estimate. Generally, the precision for each
query depends on whether actual shortest paths pass near
the landmarks. Therefore, by selecting central vertices as
landmarks, the accuracy of estimates becomes much better
than selecting landmarks randomly [29, 11]. However, for
close pairs, the precision is still much worse than the average,
since the shortest paths between them are short and
they are unlikely to pass near the landmarks [3].
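
The landmark-based estimate can be illustrated with a short Python sketch. It assumes an unweighted, undirected graph given as an adjacency dictionary; the function names `bfs_distances`, `build_landmark_index`, and `estimate_distance` are hypothetical and are not taken from any of the cited methods. The example shows that the estimate is exact when a shortest path passes through a landmark and can be poor for a close pair that does not.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def build_landmark_index(adj, landmarks):
    """Precompute d_G(l, u) for each landmark l and every vertex u."""
    return {l: bfs_distances(adj, l) for l in landmarks}

def estimate_distance(index, u, v):
    """Upper bound: minimum over landmarks l of d_G(u, l) + d_G(l, v)."""
    best = float("inf")
    for dist in index.values():
        if u in dist and v in dist:
            best = min(best, dist[u] + dist[v])
    return best

# Path graph 0 - 1 - 2 - 3 - 4 with vertex 2 as the only landmark.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
index = build_landmark_index(adj, [2])
far = estimate_distance(index, 0, 4)    # shortest path passes through the landmark
close = estimate_distance(index, 0, 1)  # shortest path misses the landmark
```

Here the pair (0, 4) is answered exactly, while the adjacent pair (0, 1) is overestimated because its only shortest path does not touch the landmark.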

To further improve the accuracy, several techniques were
proposed [15, 38, 30]. They typically store shortest-path trees
rooted at the landmarks instead of just storing distances
from the landmarks. To answer queries, they extract paths
from the shortest-path trees as candidates for shortest paths,
and improve them by finding loops or shortcuts. While they
significantly improve the accuracy, the query time becomes
up to three orders of magnitude slower.

Table 2: Frequently used notations.

Notation       Description
$G = (V, E)$   A graph
$n$            Number of vertices in graph $G$
$m$            Number of edges in graph $G$
$N_G(v)$       Neighbors of vertex $v$ in graph $G$
$d_G(u, v)$    Distance between vertices $u$ and $v$ in graph $G$
$P_G(u, v)$    Set of all the vertices on the shortest paths
               between vertices $u$ and $v$ in graph $G$

3. PRELIMINARIES 
3.1 Notations 

Table 2 lists the notations that are frequently used in this
paper. In this paper, we mainly focus on networks that are
modeled as graphs. Let $G = (V, E)$ be a graph with vertex
set $V$ and edge set $E$. We use symbols $n$ and $m$ to denote
the number of vertices $|V|$ and the number of edges $|E|$,
respectively, when the graph is clear from the context. We
also denote the vertex set of $G$ by $V(G)$ and the edge set of
$G$ by $E(G)$. We denote the neighbors of a vertex $v \in V$ by
$N_G(v)$. That is, $N_G(v) = \{u \in V \mid (u, v) \in E\}$.

Let $d_G(u, v)$ denote the distance between vertices $u$ and $v$. If
$u$ and $v$ are disconnected in $G$, we define $d_G(u, v) = \infty$. The
distance in graphs is a metric, and thus it satisfies the triangle
inequalities. That is, for any three vertices $s$, $t$, and $v$,

$d_G(s, t) \le d_G(s, v) + d_G(v, t)$,      (1)
$d_G(s, t) \ge |d_G(s, v) - d_G(v, t)|$.    (2)

We define $P_G(s, t) \subseteq V$ as the set of all vertices on the
shortest paths between vertices $s$ and $t$. In other words,

$P_G(s, t) = \{v \in V \mid d_G(s, v) + d_G(v, t) = d_G(s, t)\}$.
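
As an illustration of this definition, the set of vertices on shortest paths between two vertices can be computed with two BFSs on an undirected, unweighted graph. The following Python sketch is only an example; the adjacency-dictionary representation and the names `bfs_distances` and `vertices_on_shortest_paths` are assumptions made here. Returning an empty set for disconnected pairs is a convention of this sketch, not something stated in the text.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def vertices_on_shortest_paths(adj, s, t):
    """Return {v : d(s, v) + d(v, t) = d(s, t)} for an undirected graph."""
    ds = bfs_distances(adj, s)
    dt = bfs_distances(adj, t)
    if t not in ds:
        return set()  # convention of this sketch for disconnected pairs
    return {v for v in ds if v in dt and ds[v] + dt[v] == ds[t]}

# 4-cycle 0 - 1 - 2 - 3 - 0 with a pendant vertex 4 attached to vertex 2.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3, 4], 3: [2, 0], 4: [2]}
p_02 = vertices_on_shortest_paths(adj, 0, 2)
```

In this example both paths around the cycle are shortest paths between 0 and 2, so every cycle vertex belongs to the set, while the pendant vertex 4 does not.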

3.2 Problem Definition 

This paper studies the following problem: given a graph 
G, construct an index to efficiently answer distance queries, 
which asks the distance between an arbitrary pair of vertices. 

For simplicity of exposition, we mainly consider undirected,
unweighted graphs. However, our algorithm can be
easily extended to directed and/or weighted graphs, and
we discuss this extension in Section 6. Furthermore,
our method can answer not only distances but also shortest
paths. This extension is also discussed in Section 6.

3.3 Labels and 2-Hop Cover 

The general framework of 2-hop cover [13, 12, 2], sometimes
called a labeling method, is as follows. Our method
also follows this framework.

For each vertex $v$, we precompute a label denoted as $L(v)$,
which is a set of pairs $(u, \delta_{uv})$, where $u$ is a vertex and $\delta_{uv} =
d_G(u, v)$. We sometimes call the set of labels $\{L(v)\}_{v \in V}$
an index. To answer a distance query between vertices $s$ and
$t$, we compute and answer $\mathrm{Query}(s, t, L)$ defined as follows:

$\mathrm{Query}(s, t, L) = \min \{\delta_{vs} + \delta_{vt} \mid (v, \delta_{vs}) \in L(s), (v, \delta_{vt}) \in L(t)\}$.

We define $\mathrm{Query}(s, t, L) = \infty$ if $L(s)$ and $L(t)$ do not share
any vertex. We call $L$ a (distance-aware) 2-hop cover of $G$
if $\mathrm{Query}(s, t, L) = d_G(s, t)$ for any pair of vertices $s$ and $t$.
For each vertex $v$, we store the label $L(v)$ so that pairs
in it are sorted by their vertices. Then, we can compute
$\mathrm{Query}(s, t, L)$ in $O(|L(s)| + |L(t)|)$ time using a merge-join-like
algorithm.
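
A merge-join-like evaluation of the query over two sorted labels might look like the following Python sketch. It assumes that each label is a list of (vertex, distance) pairs sorted by vertex identifier; the function name `query` and the example labels are hypothetical.

```python
INF = float("inf")

def query(label_s, label_t):
    """Scan two labels sorted by hub vertex and return the best hub sum."""
    i = j = 0
    best = INF
    while i < len(label_s) and j < len(label_t):
        hub_s, dist_s = label_s[i]
        hub_t, dist_t = label_t[j]
        if hub_s == hub_t:
            # Common hub: candidate distance through this vertex.
            best = min(best, dist_s + dist_t)
            i += 1
            j += 1
        elif hub_s < hub_t:
            i += 1
        else:
            j += 1
    return best

# Hypothetical labels sharing the hubs 0 and 7.
label_s = [(0, 2), (3, 1), (7, 4)]
label_t = [(0, 3), (5, 1), (7, 0)]
result = query(label_s, label_t)
disjoint = query([(1, 1)], [(2, 1)])  # no common hub
```

Each pointer only moves forward, so the scan touches every pair at most once, which matches the linear bound in the label sizes stated above.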

4. ALGORITHM DESCRIPTION 

4.1 Naive Landmark Labeling 

We start with the following naive method. As the index,
we conduct a BFS from each vertex and store distances
between all pairs. Though this method is too obvious and
inefficient, we explain the details for the exposition of the
next method.

Let $V = \{v_1, v_2, \ldots, v_n\}$. We start with an empty index
$L_0$, where $L_0(u) = \emptyset$ for every $u \in V$. Suppose we conduct
BFSs from vertices in the order of $v_1, v_2, \ldots, v_n$. After
the $k$-th BFS from a vertex $v_k$, we add distances from $v_k$
to labels of reached vertices, that is, $L_k(u) = L_{k-1}(u) \cup
\{(v_k, d_G(v_k, u))\}$ for each $u \in V$ with $d_G(v_k, u) \ne \infty$. We
do not change labels for unreached vertices, that is, $L_k(u) =
L_{k-1}(u)$ for every $u \in V$ with $d_G(v_k, u) = \infty$.

$L_n$ is the final index. Obviously $\mathrm{Query}(s, t, L_n) = d_G(s, t)$
for any pair of vertices $s$ and $t$, and therefore, $L_n$ is a correct
2-hop cover for exact distance queries. This is because,
if $s$ and $t$ are reachable, then $(s, 0) \in L_n(s)$ and
$(s, d_G(s, t)) \in L_n(t)$, for example.

This method can be considered as a variant of landmark-based
approximate methods, which we mentioned in Section 2.2.
The standard landmark-based method can be regarded
as a method that precomputes $L_l$ instead of $L_n$ and
estimates the distance between $s$ and $t$ by $\mathrm{Query}(s, t, L_l)$, where
$l \ll n$ is a parameter expressing the number of landmarks.
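
The naive labeling can be sketched in Python as follows. This is an illustrative sketch for an unweighted, undirected graph stored as an adjacency dictionary; the names `naive_labeling` and `query` are hypothetical. For brevity, the query here looks up common hubs through a dictionary rather than a merge-join.

```python
from collections import deque

INF = float("inf")

def naive_labeling(adj, order):
    """Run a full BFS from every vertex and label every reached vertex."""
    labels = {v: [] for v in adj}
    for root in order:
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            labels[u].append((root, dist[u]))  # add (v_k, d(v_k, u)) to L(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
    return labels

def query(labels, s, t):
    """Minimum of d(s, hub) + d(hub, t) over hubs common to both labels."""
    hubs_t = dict(labels[t])
    return min((d + hubs_t[h] for h, d in labels[s] if h in hubs_t), default=INF)

# Path 0 - 1 - 2 plus an isolated vertex 3.
adj = {0: [1], 1: [0, 2], 2: [1], 3: []}
labels = naive_labeling(adj, [0, 1, 2, 3])
total_pairs = sum(len(label) for label in labels.values())
```

In this small example every vertex stores one pair per vertex of its connected component, which is the quadratic index size that pruning is meant to avoid.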

4.2 Pruned Landmark Labeling 

Next, we introduce pruning to the naive method. Similarly
to the method above, we conduct pruned BFSs from
vertices in the order of $v_1, v_2, \ldots, v_n$. We start with an
empty index $L'_0$ and create an index $L'_k$ from $L'_{k-1}$ using
the information obtained by the $k$-th pruned BFS from vertex
$v_k$.

We prune BFSs as follows. Suppose that we have an index
$L'_{k-1}$ and we are conducting a BFS from $v_k$ to create a new
index $L'_k$. Suppose that we are visiting a vertex $u$ with
distance $\delta$. If $\mathrm{Query}(v_k, u, L'_{k-1}) \le \delta$, then we prune $u$, that
is, we do not add $(v_k, \delta)$ to $L'_k(u)$ (i.e., $L'_k(u) = L'_{k-1}(u)$)
and we do not traverse any edge from vertex $u$. Otherwise,
we set $L'_k(u) = L'_{k-1}(u) \cup \{(v_k, \delta)\}$ and traverse all the edges
from the vertex $u$ as usual. As with the previous method, we
also set $L'_k(u) = L'_{k-1}(u)$ for all vertices $u \in V$ that were not
visited in the $k$-th pruned BFS. This algorithm, performing
pruned BFSs, is described as Algorithm 1, and the whole
preprocessing algorithm is described as Algorithm 2.

Figure 1 shows examples of pruned BFSs. The first pruned
BFS from vertex 1 visits all the vertices (Figure 1a). During
the next pruned BFS from vertex 2 (Figure 1b), when we
visit vertex 6, since $\mathrm{Query}(2, 6, L'_1) = d_G(2, 1) + d_G(1, 6) =
3 = d_G(2, 6)$, we prune vertex 6 and we do not traverse edges
from it. We also prune vertices 1 and 12. As the number of
performed BFSs increases, we can confirm that the search
space gets smaller and smaller (Figures 1c, 1d, and 1e).

Algorithm 1 Pruned BFS from $v_k \in V$ to create index $L'_k$.

 1: procedure PrunedBFS($G$, $v_k$, $L'_{k-1}$)
 2:   $Q \leftarrow$ a queue with only one element $v_k$.
 3:   $P[v_k] \leftarrow 0$ and $P[v] \leftarrow \infty$ for all $v \in V(G) \setminus \{v_k\}$.
 4:   $L'_k[v] \leftarrow L'_{k-1}[v]$ for all $v \in V(G)$.
 5:   while $Q$ is not empty do
 6:     Dequeue $u$ from $Q$.
 7:     if $\mathrm{Query}(v_k, u, L'_{k-1}) \le P[u]$ then
 8:       continue
 9:     $L'_k[u] \leftarrow L'_{k-1}[u] \cup \{(v_k, P[u])\}$
10:     for all $w \in N_G(u)$ s.t. $P[w] = \infty$ do
11:       $P[w] \leftarrow P[u] + 1$.
12:       Enqueue $w$ onto $Q$.
13:   return $L'_k$

Algorithm 2 Compute a 2-hop cover index by pruned BFS.

procedure Preprocess($G$)
  $L'_0[v] \leftarrow \emptyset$ for all $v \in V(G)$.
  for $k = 1, 2, \ldots, n$ do
    $L'_k \leftarrow$ PrunedBFS($G$, $v_k$, $L'_{k-1}$)
  return $L'_n$
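
Algorithms 1 and 2 can be prototyped in a few lines of Python. The following is a minimal sketch, not the implementation evaluated in the experiments: it assumes an adjacency dictionary, keeps a single index that is updated in place, and uses a simple dictionary-based `query` instead of a merge-join. All function names are hypothetical.

```python
from collections import deque

INF = float("inf")

def query(labels, s, t):
    """Minimum of d(s, hub) + d(hub, t) over hubs common to both labels."""
    hubs_t = dict(labels[t])
    return min((d + hubs_t[h] for h, d in labels[s] if h in hubs_t), default=INF)

def pruned_bfs(adj, labels, root):
    """One pruned BFS from root; adds (root, distance) to unpruned vertices."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if query(labels, root, u) <= dist[u]:
            continue  # current labels already answer (root, u): prune u
        labels[u].append((root, dist[u]))
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

def preprocess(adj, order):
    """Run pruned BFSs in the given vertex order and return the index."""
    labels = {v: [] for v in adj}
    for root in order:
        pruned_bfs(adj, labels, root)
    return labels

# Path graph 0 - 1 - 2 - 3 - 4, processing the central vertex first.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = preprocess(adj, [2, 1, 3, 0, 4])
index_size = sum(len(label) for label in labels.values())
```

On this path graph the pruned index holds 11 pairs instead of the 25 pairs that unpruned BFSs from all five vertices would store, while every pairwise query still returns the exact distance.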

4.3 Proof of Correctness 

In the following, we prove that this method computes a
correct 2-hop cover index, that is, $\mathrm{Query}(s, t, L'_n) = d_G(s, t)$
for any pair of vertices $s$ and $t$.

Theorem 4.1. For any $0 \le k \le n$ and for any pair of
vertices $s$ and $t$, $\mathrm{Query}(s, t, L'_k) = \mathrm{Query}(s, t, L_k)$.

Proof. We prove the theorem by mathematical induction
on $k$. Since $L'_0 = L_0$, it is true for $k = 0$. Now we
assume it holds for $0, 1, \ldots, k - 1$ and prove it also holds for
$k$.

Let $s, t$ be a pair of vertices. We assume these vertices
are reachable in $G$, since otherwise the answer $\infty$ can be
obviously obtained. Let $j$ be the smallest number such
that $(v_j, \delta_{v_j s}) \in L_k(s)$, $(v_j, \delta_{v_j t}) \in L_k(t)$ and $\delta_{v_j s} + \delta_{v_j t} =
\mathrm{Query}(s, t, L_k)$. We prove that $(v_j, \delta_{v_j s})$ and $(v_j, \delta_{v_j t})$ are
also included in $L'_k(s)$ and $L'_k(t)$. This immediately leads
to $\mathrm{Query}(s, t, L'_k) = \mathrm{Query}(s, t, L_k)$. Due to the symmetry
between $s$ and $t$, we prove only $(v_j, \delta_{v_j s}) \in L'_k(s)$.

First, for any $i < j$, we prove by contradiction that $v_i \notin
P_G(v_j, s)$. If we assume $v_i \in P_G(v_j, s)$, from Inequality (1),

$\mathrm{Query}(s, t, L_k) = d_G(s, v_j) + d_G(v_j, t)$
$= d_G(s, v_i) + d_G(v_i, v_j) + d_G(v_j, t)$
$\ge d_G(s, v_i) + d_G(v_i, t)$.

Since $(v_i, d_G(s, v_i)) \in L_k(s)$ and $(v_i, d_G(t, v_i)) \in L_k(t)$, this
contradicts the assumption of the minimality of $j$. Therefore,
$v_i \notin P_G(v_j, s)$ holds for any $i < j$.

Now we prove that $(v_j, d_G(v_j, s)) \in L'_k(s)$. Actually, we
prove a more general fact: $(v_j, d_G(v_j, u)) \in L'_k(u)$ for all
$u \in P_G(v_j, s)$. Note that $s \in P_G(v_j, s)$. Suppose that we
are conducting the $j$-th pruned BFS from $v_j$ to create $L'_j$.
Let $u \in P_G(v_j, s)$. Since $P_G(v_j, u) \subseteq P_G(v_j, s)$ and $v_i \notin
P_G(v_j, s)$ for any $i < j$, we have $v_i \notin P_G(v_j, u)$ for any
$i < j$. Therefore, $\mathrm{Query}(v_j, u, L'_{j-1}) > d_G(v_j, u)$ holds.
Thus, we visit all vertices $u \in P_G(v_j, s)$ without pruning,
and it follows that $(v_j, d_G(v_j, u)) \in L'_j(u) \subseteq L'_k(u)$. □

Figure 1: Examples of pruned BFSs. Yellow vertices denote the roots, blue vertices denote those which we visited and labeled,
red vertices denote those which we visited but pruned, and gray vertices denote those which are already used as roots.
(a) First BFS from vertex 1. We visited all the vertices.
(b) Second BFS from vertex 2. We did not add labels to five vertices.
(c) Third BFS from vertex 3. We only visited the lower half of the vertices.
(d) Fourth BFS from vertex 4. This time we only visited the higher half.
(e) Fifth BFS from vertex 5. The search space was even smaller.

As a corollary, our method is proved to be an exact dis- 
tance querying method by instantiating the theorem with 
k = n. 

Corollary 4.1. For any pair of vertices $s$ and $t$,
$\mathrm{Query}(s, t, L'_n) = d_G(s, t)$.

4.4 Vertex Ordering Strategies 

4.4.1 Motivation 

In the algorithm description above, we conducted pruned
BFSs from vertices in the order of $v_1, v_2, \ldots, v_n$. We can
freely choose the order, and moreover it turns out that the
order is crucial for the performance of this method, as we will
see in the experimental results presented in Section 7.3.4.

To decide the order of vertices, we should select central
vertices first in the sense that many shortest paths pass
through these vertices. Since we would like to prune later
BFSs as much as possible, we want to cover a larger part of
the pairs of vertices by earlier BFSs. That is, the earlier labels
should offer correct distances for as many pairs of vertices as
possible, and therefore the earlier vertices should be those
that many shortest paths pass through.

This problem is quite similar to the problem of selecting
good landmarks for landmark-based approximate methods,
which is discussed well in [29]. In that problem, we also
want to select good landmarks so that many shortest paths
pass through these vertices or nearby vertices.

4.4.2 Strategies 

Based on the results on landmark-based methods [29], we
propose and examine the following three strategies. In
experiments, we basically use the Degree strategy, and compare
them empirically in Section 7.3.4.

Random: We order vertices randomly. We use this method 
as a baseline to show the significance of other strategies. 

Degree: We order vertices from those with higher degree. 
The idea behind this strategy is that vertices with higher 
degree have stronger connection to many other vertices and 
therefore many shortest paths would pass through them. 



Closeness: We order vertices from those with the highest
closeness centrality. Since computing exact closeness centrality
for all vertices costs $O(nm)$ time, which is too expensive
for large-scale networks, we approximate closeness
centrality by randomly sampling a small number of vertices
and computing distances from those vertices to all vertices.
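
The three ordering strategies can be sketched in Python as follows. This is only an illustration under assumptions made here: the graph is an adjacency dictionary, the function names and the `seed` and `num_samples` parameters are hypothetical, and unreachable vertices are charged a distance of $n$ in the closeness approximation, a choice the text does not specify.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def random_order(adj, seed=0):
    """Random strategy: a shuffled copy of the vertex list."""
    order = list(adj)
    random.Random(seed).shuffle(order)
    return order

def degree_order(adj):
    """Degree strategy: vertices with higher degree come first."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)

def approx_closeness_order(adj, num_samples, seed=0):
    """Closeness strategy: smaller total distance to sampled vertices first."""
    rng = random.Random(seed)
    samples = rng.sample(list(adj), num_samples)
    total = {v: 0 for v in adj}
    for s in samples:
        dist = bfs_distances(adj, s)
        for v in adj:
            # Assumption of this sketch: unreachable vertices cost n.
            total[v] += dist.get(v, len(adj))
    return sorted(adj, key=lambda v: total[v])

# Star graph: center 0 connected to leaves 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
by_degree = degree_order(adj)
by_closeness = approx_closeness_order(adj, num_samples=5)
```

On the star graph both the Degree and the Closeness strategies place the center first, which is the vertex that all shortest paths between leaves pass through.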

4.5 Efficient Implementation 

4.5.1 Preprocessing (Algorithm 1)

Index: First, in the description above, we treated $L'_{k-1}$ and
$L'_k$ separately and explained as if we copy $L'_{k-1}$ to $L'_k$ for
simplicity of explanation. However, this copy can be easily
avoided by keeping only one index and adding labels to it
after each pruned BFS.

Initialization: Another important note is to avoid $O(n)$
time initialization for each pruned BFS. The reason why this
method is efficient is that the search space of pruned BFSs
gets much smaller than the whole graph. However,
if we spend $O(n)$ time for initialization, it would be the
bottleneck. What we want to do in the initialization is to
set all values in the array storing tentative distances to $\infty$
(Line 3). We can avoid $O(n)$ time initialization as follows.
Before we conduct the first pruned BFS, we set all values in
the array $P$ to $\infty$. (This takes $O(n)$ time but we do this only
once.) Then, during each pruned BFS, we store all vertices
we visited, and after each pruned BFS, we set $P[v]$ to $\infty$ for
each vertex $v$ we have visited.

Arrays: For the array storing tentative distances, it is bet- 
ter to use 8-bit integers. Since networks of our interest are 
small-world networks, 8-bit integers are enough to represent 
distances. Using 8-bit integers, the array fits into low-level
cache memories of recent computers, resulting in a speed-up
from reduced cache misses.

Querying: To evaluate queries for pruning (Line 7), it is
faster to use an algorithm different from the normal one,
since we can exploit the fact that we issue many queries
whose one endpoint is always $v_k$. Before starting the $k$-th
pruned BFS from $v_k$, we prepare an array $T$ of length $n$
initialized with $\infty$ and set $T[w] = \delta_{w v_k}$ for all $(w, \delta_{w v_k}) \in
L'_{k-1}(v_k)$. To evaluate $\mathrm{Query}(v_k, u, L'_{k-1})$, for all $(w, \delta_{wu}) \in
L'_{k-1}(u)$, we compute $\delta_{wu} + T[w]$ and return the minimum.
Though the normal querying algorithm takes $O(|L'_{k-1}(v_k)| +
|L'_{k-1}(u)|)$ time, this algorithm runs in $O(|L'_{k-1}(u)|)$ time.
As Line 7 is the bottleneck of the algorithm, this technique
speeds up preprocessing by about a factor of two. Note that $T$ should
be represented by 8-bit integers for the same reason described
above, and $O(n)$ time initialization for array $T$ should be
avoided in the same way as for array $P$.
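
The index, initialization, and querying techniques above can be combined as in the following Python sketch. It is an illustration under assumptions made here: vertices are numbered 0 to n - 1, the graph is a list of neighbor lists, labels store ranks instead of vertex identifiers, and Python floats stand in for the 8-bit integers mentioned in the text. The names `preprocess`, `P`, `T`, and `visited` follow the text where possible and are otherwise hypothetical.

```python
from collections import deque

INF = float("inf")

def preprocess(adj_list, order):
    """Pruned labeling with one shared index and lazily reset arrays P and T."""
    n = len(adj_list)
    labels = [[] for _ in range(n)]  # labels[v]: (rank, distance), sorted by rank
    P = [INF] * n                    # tentative distances, initialized only once
    T = [INF] * n                    # T[rank]: distance from current root to that hub
    for rank, root in enumerate(order):
        for hub, d in labels[root]:  # load the root's label into T
            T[hub] = d
        P[root] = 0
        visited = [root]
        queue = deque([root])
        while queue:
            u = queue.popleft()
            # Pruning test in O(|L(u)|): scan only the label of u.
            if any(T[hub] + d <= P[u] for hub, d in labels[u]):
                continue
            labels[u].append((rank, P[u]))
            for w in adj_list[u]:
                if P[w] == INF:
                    P[w] = P[u] + 1
                    visited.append(w)
                    queue.append(w)
        for v in visited:            # reset only the entries that were touched
            P[v] = INF
        for hub, _ in labels[root]:
            T[hub] = INF
    return labels

# Path graph 0 - 1 - 2 - 3 - 4, processing the central vertex first.
adj_list = [[1], [0, 2], [1, 3], [2, 4], [3]]
labels = preprocess(adj_list, [2, 1, 3, 0, 4])
```

Because pairs are appended in increasing rank order, every label in this sketch comes out sorted without an explicit sort, and the cost of resetting `P` and `T` is proportional to the size of the pruned search rather than to n.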

Prefetching: Unfortunately, we cannot fit the index and 
the adjacency lists into the cache memory for large-scale net- 
works. However, we can manually prefetch them to reduce 
the cache misses, since vertices which we will access soon are 
in the queue. Manual prefetching speeds up preprocessing 
by about 20%. 

Thread-Level Parallelism: As with parallel BFS algo- 
rithms [3], the pruned BFS algorithm can be also paral- 
lelized. However, for simple experimental analysis and fair 
comparison to previous methods, we did not parallelize our 
implementation in the experiments. 

Sorting Labels: When applying merge-join-like algorithms
to answer queries, pairs in labels need to be sorted by vertices.
However, we do not actually need to sort explicitly if we
store ranks of vertices instead of vertices. That is, when
adding a pair $(u, \delta)$ in the $i$-th pruned BFS from vertex $u$,
we add a pair $(i, \delta)$ instead. Then, since pairs are added
from vertices with lower rank to those with higher rank, all
the labels are automatically sorted.

4.5.2 Querying 

Sentinel: We add a dummy entry, $(n, \infty)$, to the label $L(v)$
for each $v \in V$. This dummy entry ensures that we find the
same vertex, $n$, at the end when scanning two labels. Thus
we can avoid separately testing whether we have scanned to
the end.
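
The effect of the sentinel on the scanning loop can be seen in the following Python sketch. It assumes labels are lists of (rank, distance) pairs sorted by rank, with ranks in the range 0 to n - 1; the names `add_sentinel` and `query_with_sentinel` are hypothetical.

```python
INF = float("inf")

def add_sentinel(label, n):
    """Append the dummy entry (n, infinity) to a label sorted by rank."""
    return label + [(n, INF)]

def query_with_sentinel(label_s, label_t, n):
    """Merge-join without bounds checks; the scan stops at the shared sentinel."""
    i = j = 0
    best = INF
    while True:
        hub_s, dist_s = label_s[i]
        hub_t, dist_t = label_t[j]
        if hub_s == hub_t:
            if hub_s == n:  # both scans reached the sentinel
                return best
            best = min(best, dist_s + dist_t)
            i += 1
            j += 1
        elif hub_s < hub_t:
            i += 1
        else:
            j += 1

# Hypothetical labels over ranks 0..7, so the sentinel rank is n = 8.
n = 8
label_s = add_sentinel([(0, 2), (3, 1), (7, 4)], n)
label_t = add_sentinel([(0, 3), (5, 1), (7, 0)], n)
result = query_with_sentinel(label_s, label_t, n)
```

Since the sentinel carries the largest rank, neither pointer can move past it before the other one arrives, so the end-of-scan test is needed only when the two ranks match.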

Arrays: For each label $L(v)$, it is faster to store the array
for vertices and the array for distances separately, since distances
are only used when vertices match [1]. We also align
arrays to cache lines.

4.6 Theoretical Properties 

4.6.1 Minimality

Theorem 4.2. Let $L'_n$ be the index defined in Section 4.2.
$L'_n$ is minimal in the sense that, for any vertex $v$ and for any
pair $(u, \delta_{uv}) \in L'_n(v)$, there is a pair of vertices $(s, t)$ such
that, if we remove $(u, \delta_{uv})$ from $L'_n(v)$, we cannot answer
the correct distance between $s$ and $t$.

Proof. Let $v_i \in V$ and $(v_j, \delta_{v_j v_i}) \in L'_n(v_i)$. This implies
$j \le i$. We show that if we remove $(v_j, \delta_{v_j v_i})$ from $L'_n(v_i)$
then we cannot answer the correct distance between $v_i$ and
$v_j$. We claim that, for any $k \ne j$, either (i) $(v_k, \delta_{v_k v_i}) \notin
L'_n(v_i)$ or $(v_k, \delta_{v_k v_j}) \notin L'_n(v_j)$ holds, or (ii) $d_G(v_i, v_k) +
d_G(v_k, v_j) > d_G(v_i, v_j)$ holds. Suppose $k < j$ and assume
that (ii) does not hold. Then, (i) must hold since otherwise
the $j$-th BFS should have pruned vertex $v_i$ and $(v_j, \delta_{v_j v_i}) \notin
L'_n(v_i)$. Suppose $k > j$ and assume that (ii) does not hold.
Then, $v_k \in P_G(v_i, v_j)$ and therefore $(v_j, \delta_{v_j v_k}) \in L'_j(v_k)$;
thus the $k$-th BFS prunes vertex $v_j$, leading to $(v_k, \delta_{v_k v_j}) \notin
L'_n(v_j)$. □

4. 6. 2 Exploiting Existence of Highly Central Vertices 

Next, we compare our method with landmark-based methods to show that
our method can also exploit the existence of highly central vertices.
We consider the standard landmark-based method [29, 40], which does
not use any path heuristics. As we stated in Section 2.2, by selecting
central vertices as landmarks, it attains remarkable average precision
for real-world networks. From the following theorem, we can observe
that our method is efficient for networks whose distances can be
answered by landmark-based methods with such high precision, and that
our method can also exploit the existence of these central vertices.

Theorem 4.3. If we assume that the standard landmark-based
approximate method can answer correct distances for
$(1 - \epsilon) n^2$ pairs (out of $n^2$ pairs) using $k$ landmarks,
then the pruned landmark labeling method gives an index with average
label size $O(k + \epsilon n)$.

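Proof Sketch. The pruned BFSs from the $k$ landmark vertices, which
we conduct first, add at most $kn$ pairs to the index. After that, at
most $\epsilon n^2$ pairs are added to the index in total, since we
never add pairs whose distance can be answered from current labels.
Hence the index contains at most $kn + \epsilon n^2$ pairs, that is,
$O(k + \epsilon n)$ pairs per vertex on average. □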

4.6.3 Exploiting Small Tree-width of Fringes

Finally, we give theoretical evidence that our method can also
exploit tree-like fringes efficiently. As we mentioned in Section 2.1,
methods based on tree decompositions were proposed for distance
queries [41, 4]. Both of them extend methods which work efficiently
for graphs of small tree-width, and they exploit the low tree-width of
fringes in real-world networks by tree decompositions. Interestingly,
though we do not use tree decompositions explicitly, we can prove that
our method can efficiently process graphs of small tree-width. Thus,
our method implicitly exploits this property of real-world networks
too. For definitions of tree-width and tree decompositions, see [35].

Theorem 4.4. Let $w$ be the tree-width of $G$. There is an order of
vertices with which the pruned landmark labeling method takes
$O(wm \log n + w^2 n \log^2 n)$ time for preprocessing, stores an
index with $O(wn \log n)$ space, and answers each query in
$O(w \log n)$ time.

Proof Sketch. The key ingredient is the centroid decomposition [18]
of the tree decomposition. First we conduct pruned BFSs from all the
vertices in a centroid bag. Then, later pruned BFSs never go beyond the
bag. Therefore, we can regard the tree decomposition as divided into
disjoint components, each having at most half of the bags. We
recursively repeat this procedure. The depth of the recursion is
$O(\log n)$. Since we add at most $w$ pairs to each vertex at each
depth of recursion, the number of pairs in each label is
$O(w \log n)$. At each depth of recursion, the total time consumed by
pruned BFSs from the current components is $O(wm + w^2 n \log n)$,
where $O(wm)$ is the time for traversing edges and $O(w^2 n \log n)$
is the time for pruning tests. □

5. BIT-PARALLEL LABELING 

To further speed up both preprocessing and querying, we propose an
optimization which exploits bit-level parallelism. Bit-parallel
methods are those that perform different calculations on different
bits in the same word to exploit the fact that computers can perform
bitwise operations on a word at once. The word length is commonly 32
or 64 in today's computers.

In the following, we denote the number of bits in a computer word as
$b$ and assume bitwise operations on bit vectors of length $b$ can be
done in $O(1)$ time. We propose an algorithm to conduct BFSs and
compute labels from $b + 1$ roots simultaneously in $O(m)$ time.
Moreover, we also propose a method to answer distance queries for any
pair of vertices via one of these $b + 1$ vertices in $O(1)$ time.

Algorithm 3 Bit-parallel BFS from $r \in V$ and $S_r \subseteq N_G(r)$.
procedure Bp-BFS($G$, $r$, $S_r$)
    $(P[v], S_r^{-1}[v], S_r^0[v]) \leftarrow (\infty, \emptyset, \emptyset)$ for all $v \in V$
    $(P[r], S_r^{-1}[r], S_r^0[r]) \leftarrow (0, \emptyset, \emptyset)$
    $(P[v], S_r^{-1}[v], S_r^0[v]) \leftarrow (1, \{v\}, \emptyset)$ for all $v \in S_r$
    $Q_0, Q_1 \leftarrow$ an empty queue
    Enqueue $r$ onto $Q_0$
    Enqueue $v$ onto $Q_1$ for all $v \in S_r$
    while $Q_0$ is not empty do
        $E_0 \leftarrow \emptyset$ and $E_1 \leftarrow \emptyset$
        while $Q_0$ is not empty do
            Dequeue $v$ from $Q_0$
            for all $u \in N_G(v)$ do
                if $P[u] = \infty \lor P[u] = P[v] + 1$ then
                    $E_1 \leftarrow E_1 \cup \{(v, u)\}$
                    if $P[u] = \infty$ then
                        $P[u] \leftarrow P[v] + 1$
                        Enqueue $u$ onto $Q_1$
                else if $P[u] = P[v]$ then
                    $E_0 \leftarrow E_0 \cup \{(v, u)\}$
        for all $(v, u) \in E_0$ do
            $S_r^0[u] \leftarrow S_r^0[u] \cup S_r^{-1}[v]$
        for all $(v, u) \in E_1$ do
            $S_r^{-1}[u] \leftarrow S_r^{-1}[u] \cup S_r^{-1}[v]$
            $S_r^0[u] \leftarrow S_r^0[u] \cup S_r^0[v]$
        $Q_0 \leftarrow Q_1$ and $Q_1 \leftarrow \emptyset$
    return $(P, S_r^{-1}, S_r^0)$

5.1 Bit-parallel Labels 

To describe the preprocessing algorithm and the querying 
algorithm, we first define what we store in the index. 

As we explain in the next subsection, we conduct bit-parallel BFSs
from a vertex $r$ and a subset of its neighbors $S_r \subseteq N_G(r)$
with size at most $b$. We define

$S_r^i(v) = \{u \in S_r \mid d_G(u, v) - d_G(r, v) = i\}$.

Since vertices in $S_r$ are neighbors of $r$, for any vertex
$u \in S_r$ and any vertex $v \in V$,
$|d_G(u, v) - d_G(r, v)| \le 1$. Therefore, for each $v \in V$, $S_r$
can be partitioned into $S_r^{-1}(v)$, $S_r^0(v)$, and $S_r^{+1}(v)$.
That is, $S_r^{-1}(v) \cup S_r^0(v) \cup S_r^{+1}(v) = S_r$.

We compute bit-parallel labels and store them in the index. For each
vertex $v \in V$, we precompute a bit-parallel label denoted as
$L_{BP}(v)$. $L_{BP}(v)$ is a set of quadruples
$(u, \delta_{uv}, S_u^{-1}(v), S_u^0(v))$, where $u \in V$ is a vertex,
$\delta_{uv} = d_G(u, v)$ and $S_u^i(v) \subseteq S_u$ is defined
above. We store $S_u^{-1}(v)$ and $S_u^0(v)$ by bit vectors of $b$
bits. Note that $S_u^{+1}(v)$ can be obtained as
$S_u \setminus (S_u^{-1}(v) \cup S_u^0(v))$, but actually we do not use
$S_u^{+1}(v)$ in the querying algorithm.

In order to describe subsets of $S_r$ by bit vectors of $b$ bits, we
assign a unique number between one and $|S_r|$ to each vertex in
$S_r$, and express the presence of the $i$-th vertex by setting the
$i$-th bit.

5.2 Bit-parallel BFS 

For the moment, we put aside the pruning discussed in Section 4.2 and
make a bit-parallel version of the naive labeling method discussed in
Section 4.1. We introduce pruning later in Section 5.4.

Let $r \in V$ be a vertex and $S_r \subseteq N_G(r)$ be a subset of
neighbors of $r$ with size at most $b$. We explain an algorithm to
compute $d_G(r, v)$, $S_r^{-1}(v)$ and $S_r^0(v)$ for all $v \in V$
that are reachable from $\{r\} \cup S_r$. The algorithm is described
as Algorithm 3. Basically we conduct a BFS from $r$ computing sets
$S_r^{-1}$ and $S_r^0$.

Let $v$ be a vertex. Suppose that we have already computed
$S_r^{-1}(w)$ for all $w$ such that $d_G(r, w) < d_G(r, v)$. We can
compute $S_r^{-1}(v)$ as follows,

$\{u \in S_r \mid u \in S_r^{-1}(w), w \in N_G(v), d_G(r, w) = d_G(r, v) - 1\}$,

since if $u$ is in $S_r^{-1}(v)$, $d_G(u, v) = d_G(r, v) - 1$ and
therefore $u$ is on one of the shortest paths from $r$ to $v$.
Similarly, assuming that we have already computed $S_r^{-1}(w)$ for
all $w$ such that $d_G(r, w) \le d_G(r, v)$ and $S_r^0(w)$ for all $w$
such that $d_G(r, w) < d_G(r, v)$, we can compute $S_r^0(v)$ as
follows,

$\{u \in S_r \mid u \in S_r^0(w), w \in N_G(v), d_G(r, w) = d_G(r, v) - 1\}$
$\cup \{u \in S_r \mid u \in S_r^{-1}(w), w \in N_G(v), d_G(r, w) = d_G(r, v)\}$.

Therefore, along with the breadth-first search, we can compute
$S_r^{-1}$ and $S_r^0$ alternately by dynamic programming in the
increasing order of distance from $r$. That is, first we compute
$S_r^{-1}(v)$ for all $v \in V$ such that $d_G(r, v) = 1$, next we
compute $S_r^0(v)$ for all $v \in V$ such that $d_G(r, v) = 1$, then
we compute $S_r^{-1}(v)$ for all $v \in V$ such that $d_G(r, v) = 2$,
next we compute $S_r^0(v)$ for all $v \in V$ such that
$d_G(r, v) = 2$, and so on. Note that operations on sets can be done
in $O(1)$ time by representing sets by bit vectors and using bitwise
operations.
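
The dynamic programming described above can be sketched in Python as follows. This is an illustrative transcription of Algorithm 3, not the authors' code: the name `bp_bfs`, the adjacency-list input, and the use of Python integers as bit vectors (bit `i` stands for the `i`-th vertex of `s_r`, counted from zero) are assumptions. The propagation can place a vertex in both computed sets, so this sketch additionally clears from the second set every bit already present in the first, which makes the result agree with the definition of $S_r^0$.

```python
def bp_bfs(adj, r, s_r):
    """Bit-parallel BFS from r and a list s_r of neighbors of r.
    Returns (P, Sm1, S0): distances from r and two bit masks per vertex."""
    n = len(adj)
    INF = float("inf")
    P = [INF] * n
    Sm1 = [0] * n   # bit i set: i-th vertex of s_r is one step closer than r
    S0 = [0] * n    # bit i set: i-th vertex of s_r is as close as r
    P[r] = 0
    for i, v in enumerate(s_r):
        P[v] = 1
        Sm1[v] = 1 << i
    q0, q1 = [r], list(s_r)
    while q0:
        e0, e1 = [], []   # edges inside a level / edges to the next level
        for v in q0:
            for u in adj[v]:
                if P[u] == INF or P[u] == P[v] + 1:
                    e1.append((v, u))
                    if P[u] == INF:
                        P[u] = P[v] + 1
                        q1.append(u)
                elif P[u] == P[v]:
                    e0.append((v, u))
        for v, u in e0:
            S0[u] |= Sm1[v]
        for v, u in e1:
            Sm1[u] |= Sm1[v]
            S0[u] |= S0[v]
        q0, q1 = q1, []
    # Keep the two sets disjoint, as in the definition of the partition.
    S0 = [s0 & ~sm1 for s0, sm1 in zip(S0, Sm1)]
    return P, Sm1, S0
```

Each edge is inspected a constant number of times and each set update is a single bitwise OR, which is where the $O(m)$ bound for $b + 1$ simultaneous roots comes from when the masks fit in one machine word.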

5.3 Bit-parallel Distance Querying 

To process a distance query between a pair of vertices $s$ and $t$,
as with normal labels, we scan bit-parallel labels $L_{BP}(s)$ and
$L_{BP}(t)$. For each pair of quadruples that share the same root
vertex, $(r, \delta_{rs}, S_r^{-1}(s), S_r^0(s)) \in L_{BP}(s)$ and
$(r, \delta_{rt}, S_r^{-1}(t), S_r^0(t)) \in L_{BP}(t)$, from these
quadruples we compute the distance between $s$ and $t$ via one of the
vertices in $\{r\} \cup S_r$. That is, we compute
$\delta = \min_{u \in \{r\} \cup S_r} \{d_G(s, u) + d_G(u, t)\}$.

A naive way is to compute $d_G(s, u)$ and $d_G(u, t)$ for all $u$
and take the minimum, which takes $O(|S_r|)$ time. However, we propose
an algorithm to compute $\delta$ in $O(1)$ time by exploiting bitwise
operations.

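Let $\tilde{\delta} = d_G(s, r) + d_G(r, t)$. Since $\tilde{\delta}$
is an upper bound on $\delta$ and $d_G(s, u) \ge d_G(s, r) - 1$,
$d_G(u, t) \ge d_G(r, t) - 1$ for all $u \in S_r$,
$\tilde{\delta} - 2 \le \delta \le \tilde{\delta}$. Therefore, what we
have to do is to judge whether the distance $\delta$ is
$\tilde{\delta} - 2$, $\tilde{\delta} - 1$ or $\tilde{\delta}$.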

This can be done as follows. If
$S_r^{-1}(s) \cap S_r^{-1}(t) \ne \emptyset$, then
$\delta = \tilde{\delta} - 2$. Otherwise, if
$S_r^0(s) \cap S_r^{-1}(t) \ne \emptyset$ or
$S_r^{-1}(s) \cap S_r^0(t) \ne \emptyset$, then
$\delta = \tilde{\delta} - 1$, and otherwise
$\delta = \tilde{\delta}$. Note that computing intersections of sets
can be done by bitwise AND operations. Therefore, all these operations
can be done in $O(1)$ time. Thus, the distance $\delta$ can be computed
in $O(1)$ time, and, in total, we can answer each query in
$O(|L_{BP}(s)| + |L_{BP}(t)|)$ time.
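
The case analysis above amounts to a few bitwise ANDs, as the following Python sketch shows. The function name `bp_distance` and the triple layout `(distance to root, mask of S^{-1}, mask of S^0)` are assumptions chosen for illustration; both triples must refer to the same root $r$ and the same numbering of $S_r$.

```python
def bp_distance(entry_s, entry_t):
    """Distance between s and t via the best vertex in {r} | S_r.
    Each entry is (d_G(r, x), mask of S_r^{-1}(x), mask of S_r^0(x))."""
    d_s, sm1_s, s0_s = entry_s
    d_t, sm1_t, s0_t = entry_t
    upper = d_s + d_t          # distance via the root r itself
    if sm1_s & sm1_t:
        # Some u in S_r is one step closer than r to both s and t.
        return upper - 2
    if (s0_s & sm1_t) or (sm1_s & s0_t):
        # Some u in S_r saves one step on exactly one side.
        return upper - 1
    return upper
```

For example, with `entry_s = (2, 0b01, 0b10)` and `entry_t = (2, 0b10, 0b01)` the first test fails, the second succeeds, and the function returns 3 instead of the value 4 obtained through the root alone.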

5.4 Introducing to Pruned Labeling 

Now we discuss how to combine this bit-parallel labeling method and
the pruned labeling method discussed in Section 4.2. We propose a
simple and efficient way as follows. First we conduct bit-parallel
BFSs without pruning $t$ times, where $t$ is a parameter. Then, we
conduct pruned BFSs using both the bit-parallel labels and normal
labels for pruning.

This method exploits the different strengths of the pruned labeling
method and the bit-parallel labeling method. In the beginning, pruning
does not work much and pruned BFSs visit a large portion of the
vertices. Therefore, instead of pruned labeling, we use bit-parallel
labeling without pruning to efficiently cover a larger part of pairs
of vertices. Skipping the overhead of vain pruning tests also
contributes to the speed-up.

As roots and neighbor sets for bit-parallel BFSs, we propose to
greedily use vertices with the highest priority: we select a vertex
with the highest priority as the root $r$ among remaining vertices,
and we select up to $b$ vertices with the highest priority as the set
$S_r$ among remaining neighbors.
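
A minimal Python sketch of this greedy selection is given below. The name `select_bp_roots`, the `priority` array, and the interpretation of "remaining" as "not yet used as a root or as a member of an earlier neighbor set" are assumptions made for illustration.

```python
def select_bp_roots(adj, priority, t, b=64):
    """Pick up to t (root, neighbor set) pairs for bit-parallel BFSs.
    priority[v] is larger for more important vertices (e.g. the degree)."""
    n = len(adj)
    used = [False] * n
    by_priority = sorted(range(n), key=lambda v: priority[v], reverse=True)
    result = []
    pos = 0
    for _ in range(t):
        # Skip vertices consumed by earlier roots or neighbor sets.
        while pos < n and used[by_priority[pos]]:
            pos += 1
        if pos == n:
            break
        root = by_priority[pos]
        used[root] = True
        # Up to b remaining neighbors, highest priority first.
        neighbors = sorted((u for u in adj[root] if not used[u]),
                           key=lambda u: priority[u], reverse=True)[:b]
        for u in neighbors:
            used[u] = True
        result.append((root, neighbors))
    return result
```

With the Degree strategy, `priority` would simply be the list of vertex degrees; the position of a neighbor inside the returned list can then serve as its bit index in the bit-parallel label.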

As we see in the experimental results in Section 7, this method
improves the preprocessing time, the index size and the query time.
Moreover, as we also confirm in the experiments, the performance does
not degrade as long as $t$ is not set too large. Therefore we do not
have to be too careful about finding a proper value for $t$, and our
method is still easy to use.

6. VARIANTS AND EXTENSIONS 

Shortest-Path Queries: To answer not only distances but also
shortest paths, we store sets of triples instead of pairs as labels.
Label $L(v)$ is a set of triples $(u, \delta_{uv}, p_{uv})$, where
$p_{uv} \in V$ is the parent of $v$ in the pruned breadth-first search
tree rooted at $u$ created by the pruned BFS from $u$. We can restore
the shortest path between $v$ and $u$ by ascending the tree from $v$
to the parents.

Weighted Graphs: To treat weighted graphs, the only 
necessary change is to perform pruned Dijkstra's algorithm 
instead of pruned BFSs. Bit-parallel labeling cannot be used 
for weighted graphs. 
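
The weighted variant can be sketched in Python by replacing the BFS queue with a binary heap. This is an illustration under stated assumptions, not the authors' implementation: the names `pruned_dijkstra_labeling` and `query` are hypothetical, `adj[v]` is assumed to be a list of `(neighbor, weight)` pairs, and edge weights are assumed to be positive.

```python
import heapq

def pruned_dijkstra_labeling(adj, order):
    """adj[v] = list of (neighbor, weight) with positive weights.
    Returns labels[v] = list of (rank, distance), sorted by rank."""
    labels = [[] for _ in adj]
    for rank, root in enumerate(order):
        root_dist = dict(labels[root])
        dist = {root: 0}
        heap = [(0, root)]
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue          # stale heap entry
            # Same pruning test as in the unweighted case.
            if any(r in root_dist and root_dist[r] + dv <= d
                   for r, dv in labels[v]):
                continue
            labels[v].append((rank, d))
            for u, w in adj[v]:
                nd = d + w
                if nd < dist.get(u, float("inf")):
                    dist[u] = nd
                    heapq.heappush(heap, (nd, u))
    return labels

def query(labels, s, t):
    """Merge-join over two rank-sorted labels."""
    ls, lt = labels[s], labels[t]
    i = j = 0
    best = float("inf")
    while i < len(ls) and j < len(lt):
        if ls[i][0] == lt[j][0]:
            best = min(best, ls[i][1] + lt[j][1])
            i += 1
            j += 1
        elif ls[i][0] < lt[j][0]:
            i += 1
        else:
            j += 1
    return best
```

The query procedure is unchanged from the unweighted case; only the search that fills the labels differs.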

Directed Graphs: To treat directed graphs, we first redefine
$d_G(u, v)$ as the distance from $u$ to $v$. Then, we store two labels
$L_{\mathrm{OUT}}(v)$ and $L_{\mathrm{IN}}(v)$ for each vertex. Label
$L_{\mathrm{OUT}}(v)$ is a set of pairs $(u, \delta_{vu})$, where
$u \in V$ and $\delta_{vu} = d_G(v, u)$, and label
$L_{\mathrm{IN}}(v)$ is a set of pairs $(u, \delta_{uv})$, where
$u \in V$ and $\delta_{uv} = d_G(u, v)$. We can answer the distance
from vertex $s$ to vertex $t$ by $L_{\mathrm{OUT}}(s)$ and
$L_{\mathrm{IN}}(t)$. To compute these labels, from each vertex, we
conduct pruned BFSs twice: once in the forward direction and once in
the reverse direction.

Disk-based Query Answering: To answer a distance query, our querying
algorithm only refers to two contiguous regions. Thus, if the index is
disk resident, we can answer queries with two disk seek operations,
which would still be much faster than an in-memory BFS.

7. EXPERIMENTS 

We conducted experiments on a Linux server with Intel Xeon X5670
(2.93 GHz) and 48 GB of main memory. The proposed method was
implemented in C++. We used 8-bit integers to represent distances,
32-bit integers to represent vertices, and 64-bit integers to conduct
bit-parallel BFSs. For vertex ordering, we mainly use the Degree
strategy and we do not specify the vertex ordering strategy unless we
use other strategies. For query time, we generally report the average
time for 1,000,000 random queries.



Table 4: Datasets

Dataset     Network    |V|      |E|
Gnutella    Computer   63 K     148 K
Epinions    Social     76 K     509 K
Slashdot    Social     82 K     948 K
NotreDame   Web        326 K    1.5 M
WikiTalk    Social     2.4 M    4.7 M
Skitter     Computer   1.7 M    11 M
Indo        Web        1.4 M    17 M
MetroSec    Computer   2.3 M    22 M
Flickr      Social     1.8 M    23 M
Hollywood   Social     1.1 M    114 M
Indochina   Web        7.4 M    194 M

7.1 Datasets 

To show the efficiency and robustness of our method, we conducted
experiments on various real-world networks: five social networks,
three web graphs and three computer networks. We treated all the
graphs as undirected, unweighted graphs. Basically we used the five
smaller datasets to compare the performance between the proposed
method and previous methods and to analyze the behavior of these
methods, and used the six larger datasets to show the scalability of
the proposed method. The types of networks and the numbers of vertices
and edges are presented in Table 4.

7.1.1 Detailed Description 

Gnutella: This is a graph created from a snapshot of the Gnutella
P2P network in August 2002 [33].

Epinions: This graph is the on-line social network in Epinions
(www.epinions.com), where each vertex represents a user and each edge
represents a trust relationship [33].

Slashdot: This is the on-line social network in Slashdot
(slashdot.org) obtained in February 2009. Vertices correspond to users
and edges correspond to friend/foe links between the users [23].

NotreDame: This is a web graph between pages from University of
Notre Dame (domain nd.edu) collected in 1999 [5].

WikiTalk: This is the on-line social network among editors of
Wikipedia (www.wikipedia.org) created by communication on edits on
talk pages until January 2008 [21, 20].

Skitter: This is an Internet topology graph created from traceroutes
run in 2005 by Skitter [22].

Indo: This is a web graph between pages in the .in domain crawled in
2004.

MetroSec: This is a graph constructed from Internet traffic captured
by MetroSec. Each vertex represents a computer and two vertices are
linked if they appear in a packet as sender and destination [24].

Flickr: This is the on-line social network in a photo-sharing site,
Flickr (www.flickr.com) [26].

Indochina: This is a web graph of web pages in the country domains
of Indochina countries, crawled in 2004 [9, 8].

Hollywood: This is a social network of movie actors. Two 
actors are linked if they appeared in a movie together by 
2009 H[8]. 

7.1.2 Statistics 

First, we investigated the degree distribution of the networks,
since degrees of vertices play important roles in our method when we
use the Degree strategy for vertex ordering.

Figure 2: Properties of the datasets.

Table 3: Performance comparison between the proposed method and previous methods for the real-world datasets. IT denotes
indexing time, IS denotes index size, QT denotes query time, and LN denotes average label size for each vertex. DNF means
it did not finish in one day or ran out of memory.

           Pruned Landmark Labeling             Hierarchical Hub Labeling [2]    Tree Decomposition [4]   BFS
Dataset    IT        IS      QT       LN        IT        IS      QT      LN     IT       IS      QT      QT
Gnutella   54 s      209 MB  5.2 µs   437+16    245 s     380 MB  11 µs   1,275  209 s    68 MB   19 µs   3.2 ms
Epinions   1.7 s     32 MB   0.5 µs   7+16      495 s     93 MB   2.2 µs  256    128 s    42 MB   11 µs   7.4 ms
Slashdot   6.0 s     48 MB   0.8 µs   14+16     670 s     182 MB  3.9 µs  464    343 s    83 MB   12 µs   12 ms
NotreDame  4.5 s     138 MB  0.5 µs   29+16     10,256 s  64 MB   0.4 µs  41     243 s    120 MB  39 µs   17 ms
WikiTalk   61 s      1.0 GB  0.6 µs   9+16      DNF       -       -       -      2,459 s  416 MB  1.8 µs  197 ms
Skitter    359 s     2.7 GB  2.3 µs   123+64    DNF       -       -       -      DNF      -       -       190 ms
Indo       173 s     2.3 GB  1.6 µs   133+64    DNF       -       -       -      DNF      -       -       150 ms
MetroSec   108 s     2.5 GB  0.7 µs   19+64     DNF       -       -       -      DNF      -       -       150 ms
Flickr     866 s     4.0 GB  2.6 µs   247+64    DNF       -       -       -      DNF      -       -       361 ms
Hollywood  15,164 s  12 GB   15.6 µs  2,098+64  DNF       -       -       -      DNF      -       -       1.2 s
Indochina  6,068 s   22 GB   4.1 µs   415+64    DNF       -       -       -      DNF      -       -       1.5 s


Figures 2a and 2b are log-log plots of the complementary cumulative
degree distribution. As expected, we can confirm that all these
networks generally exhibit power-law degree distributions.

Then, we also examined the distribution of distances. Figures 2c and
2d show the distribution of distances for 1,000,000 random pairs of
vertices. As we can observe from these figures, these networks are
also small-world networks, in the sense that the average distance is
very small.

7.2 Performance 

First we present the performance of our method on the real-world
datasets to show the efficiency and robustness of our method. Table 3
shows the performance of our method for the datasets. IT denotes
preprocessing time, IS denotes index size, QT denotes average query
time for 1,000,000 random queries, and LN denotes the average label
size for each vertex, in the format of the size of normal labels
(left) plus the size of bit-parallel labels (right). We set the number
of bit-parallel BFSs to 16 for the first five datasets and 64 for the
rest.

In Table 3, we also list the performance of two state-of-the-art
existing methods. One is hierarchical hub labeling [2], which is also
based on distance labeling. The other one is based on tree
decompositions [4], which is an improved version of TEDI [41]. For
these previous methods, we used the implementations by the authors of
these methods, both in C++. Experiments for hierarchical hub labeling
were conducted on a Windows server with two Intel Xeon X5680
(3.33 GHz) and 96 GB of main memory. Experiments for the
tree-decomposition-based method were conducted on our environment
described above. We also report the average time to compute the
distance by breadth-first search for 1,000 random pairs of vertices.
Among these four methods including the proposed method, only the
preprocessing of hierarchical hub labeling [2] was parallelized to use
all the 12 cores. All the other timing results are sequential.

7.2.1 Preprocessing Time and Scalability 

Our emphasis is particularly on the large improvement in the
preprocessing time, leading to much better scalability. First, we
successfully preprocessed the two largest datasets, Hollywood and
Indochina, with millions of vertices and hundreds of millions of
edges, in moderate preprocessing time. This is an improvement of two
orders of magnitude on the graph size we can handle since, as we
listed in Table 1, other existing exact distance querying methods take
thousands or tens of thousands of seconds to preprocess graphs with
millions of edges.

For the next four datasets with tens of millions of edges,
preprocessing took less than one thousand seconds, while the previous
methods did not finish after one day or ran out of memory. For the
five smaller datasets, preprocessing took about one minute at most,
and was at least about 50 times faster than the previous methods for
most of them.

7.2.2 Query Time 

The average query time was generally on the order of microseconds
and at most 16 microseconds. For almost all of the five smaller
datasets, the query time of the proposed method is shorter than that
of the previous methods. Indeed, from Table 1 we can also observe that
the query time of our method is comparable to those of the existing
methods for graphs of these sizes. Moreover, we can confirm that the
query time does not increase much for larger networks.

7.2.3 Index Size

As for the five smaller networks, the results demonstrate that our
method is comparable to the previous methods with respect to index
size. However, even though computers with tens of gigabytes of memory
are neither rare nor expensive nowadays, reducing the index size can
be an important next research issue.

7.3 Analysis 

Next we analyze the behavior of our method to investigate 
why our method is efficient. 

7.3.1 Pruned BFS 

First we study how labels are computed and stored. Figure 3a shows
the number of distances added to labels in each pruned BFS, and
Figure 3b shows the cumulative distribution of it, that is, the ratio
of the distances stored no later than each step to all the distances
stored in the end. We did not use bit-parallel BFSs for these
experiments.

From these figures, we can confirm the large impact of the pruning.
Figure 3a shows that the number of distances added to labels in each
BFS decreases very rapidly. For example, after 1,000 BFSs, for all the
three datasets distances are added to the labels of less than 10% of
the vertices, and after 10,000 BFSs, for all the three datasets
distances are added to the labels of less than 1% of the vertices.
Figure 3b also shows that a large portion of the labels is computed in
the beginning.

7.3.2 Sizes of Labels 

Figure 3c shows the distribution of the sizes of labels after the
whole preprocessing, sorted in the ascending order of sizes. We can
observe that the label sizes do not differ much among vertices, and
few vertices have labels much larger than the average. This shows that
the query time of our method is quite stable.



If you are anxious about vertices with unusually large labels, you
can precompute the distances between these vertices and all the
vertices and answer them directly, since the number of such vertices
is small, as shown in Figure 3c.

7.3.3 Pair Coverage 

Figure 4a illustrates the ratio of the covered pairs of vertices,
that is, the pairs of vertices whose distances can be answered
correctly by current labels, at each step. We used 1,000,000 random
pairs to estimate these ratios. We can observe that most pairs are
covered in the beginning. This shows that a large portion of pairs
have shortest paths that pass through a small portion of central
vertices, which are selected by the Degree strategy. This is the
reason why landmark-based approximate methods have good precision, and
also the reason why our pruning works so effectively.

Figures 4b, 4c and 4d illustrate the ratio of the covered pairs of
vertices at each step with pairs classified by distance. They show
that generally distant pairs are covered earlier than close pairs.
This is the reason why the precision of landmark-based approximate
methods for close pairs is far worse than the precision for distant
pairs. On the other hand, our method aggressively exploits this
property: because distant pairs are covered in the beginning, we can
prune distant vertices when processing other vertices, which results
in fast preprocessing.



7.3.4 Vertex Ordering Strategies 

Next we see the effect of vertex ordering strategies. Table 5
describes the average size of a label for each vertex using different
vertex ordering strategies described in Section 4.4. We did not use
bit-parallel BFSs for these experiments. As we can see, the results do
not differ much between the Degree strategy and the Closeness
strategy. The Degree strategy might be slightly better. On the other
hand,

Figure 5: Performance against number of bit-parallel BFSs.



Table 5: Average size of a label for each vertex against different
vertex ordering strategies.

Dataset     Random   Degree   Closeness
Gnutella    6,171    781      865
Epinions    7,038    121      132
Slashdot    8,665    216      234
NotreDame   DNF      60       82
WikiTalk    DNF      118      158


the result of the Random strategy is much worse than those of the
other two strategies. This shows that the Degree and Closeness
strategies successfully capture central vertices.

7.3.5 Bit-parallel BFS 

Finally, we see the effect of bit-parallel BFSs discussed in
Section 5. Figure 5 shows the performance of our method against
different numbers of bit-parallel BFSs.

Figure 5a illustrates preprocessing time. It shows that, with a
proper number of bit-parallel BFSs, preprocessing gets two to ten
times faster, which further enhances the scalability of our method.
Figure 5b illustrates query time. We can confirm that queries also get
faster. Figure 5c shows the average size of a normal label for each
vertex. As we increase the number of bit-parallel BFSs, many pairs are
covered by special labels computed by bit-parallel BFSs, and the size
of normal labels decreases. Figure 5d shows the index size. With a
proper number of bit-parallel BFSs, index size also decreases.

Another important finding from these figures is that the performance
of our method is not too sensitive to the number of bit-parallel BFSs.
As they show, the performance of our method does not degrade much
unless we choose too large a number. The proper parameters seem to be
common between different networks. Therefore, our method is still easy
to use with this bit-parallel technique.

8. CONCLUSIONS 

In this paper, we proposed a novel and efficient method for exact
shortest-path distance queries on large graphs. Our method is based on
distance labeling to vertices, which is common to the existing exact
distance querying methods, but our labeling algorithm stands on a
totally new idea. Our algorithm conducts breadth-first search (BFS)
from all the vertices with pruning. Though the algorithm is simple,
our pruning surprisingly reduces the search space and the labels,
resulting in fast preprocessing time, small index size and fast query
time. Moreover, we also proposed another labeling scheme exploiting
bit-level parallelism, which can be easily combined with the pruned
labeling method to further improve the performance. Extensive
experimental results on large-scale real-world networks of various
types demonstrated the efficiency and robustness of our methods. In
particular, our method can handle networks with hundreds of millions
of edges, which are two orders of magnitude larger than the limits of
the previous methods, with comparable index size and query time.

We plan to investigate ways to handle even larger graphs, 
where indices and/or graphs might not fit in main mem- 
ory. The first way is to reduce the index size by reducing 
graphs exploiting obvious parts and symmetry [30, 14] and
compressing labels by making dictionaries of common sub- 
trees for shortest path trees pQ. Another way is disk-based 
or distributed implementation. As we stated in Section 6,
disk-based query answering is obvious and ready, and the 
challenges are particularly on preprocessing. However, since 
our preprocessing algorithm is a simple algorithm based on 
BFS, we can leverage the large body of existing work on 
BFS. In particular, since pruning can be done locally, the
preprocessing algorithm would perform well on BSP-model-based
distributed graph processing platforms [25].